Sparser Johnson-Lindenstrauss Transforms

Authors: Daniel M. Kane, Jelani Nelson
Publication date: 05/02/2014

We give two different and simple constructions for dimensionality reduction in $\ell_2$ via linear mappings that are sparse: only an $O(\varepsilon)$-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion $1+\varepsilon$ with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (JCSS 2003) and Dasgupta, Kumar, and Sarlós (STOC 2010). Such distributions can be used to speed up applications where $\ell_2$ dimensionality reduction is used.

Comment: v6: journal version, minor changes, added Remark 23; v5: modified abstract, fixed typos, added open problem section; v4: simplified section 4 by giving 1 analysis that covers both constructions; v3: proof of Theorem 25 in v2 was written incorrectly, now fixed; v2: added another construction achieving same upper bound, and added proof of near-tight lower bound for DKS scheme
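To make the idea of a sparse embedding concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' construction as specified in the paper: the block layout (rows split into `s` equal blocks, one nonzero of value ±1/√s per block in every column), the function name `sparse_embed`, and the parameters `m`, `s`, and `seed` are all hypothetical, and Python's `random` module stands in for whatever hash functions a real implementation would use.

```python
import math
import random


def sparse_embed(x, m, s, seed=0):
    """Map x (a list of floats) into R^m with s nonzeros per column.

    Assumed layout (illustrative only): the m rows are split into s
    blocks of m // s rows, and each input coordinate contributes one
    entry of value +/- 1/sqrt(s) to a random row of every block.
    """
    if s <= 0 or m % s != 0:
        raise ValueError("m must be a positive multiple of s")
    block = m // s
    scale = 1.0 / math.sqrt(s)
    # A fixed seed replays the same random draws for every input of the
    # same length, so all such vectors pass through the same matrix.
    rng = random.Random(seed)
    y = [0.0] * m
    for xj in x:
        for b in range(s):
            row = b * block + rng.randrange(block)  # one row per block
            sign = rng.choice((-1.0, 1.0))          # random sign
            y[row] += sign * scale * xj
    return y
```

Each input coordinate touches only `s` of the `m` output coordinates, so embedding a vector with `k` nonzeros costs on the order of `s * k` additions rather than `m * k`. Choosing `s` proportional to `ε · m` would correspond to the $O(\varepsilon)$ column sparsity stated in the abstract, up to constants that are not given here.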

arXiv.org e-Print Archive

Harvard University - DASH

Optimality of the Johnson-Lindenstrauss Lemma

Authors: Kasper Green Larsen, Jelani Nelson
Publication date: 08/11/2017

For any integers $d, n \geq 2$ and $1/(\min\{n,d\})^{0.4999} < \varepsilon < 1$, we show the existence of a set of $n$ vectors $X \subset \mathbb{R}^d$ such that any embedding $f : X \rightarrow \mathbb{R}^m$ satisfying

$$\forall x,y\in X,\ (1-\varepsilon)\|x-y\|_2^2 \le \|f(x)-f(y)\|_2^2 \le (1+\varepsilon)\|x-y\|_2^2$$

must have

$$m = \Omega(\varepsilon^{-2} \lg n).$$

This lower bound matches the upper bound given by the Johnson-Lindenstrauss lemma [JL84]. Furthermore, our lower bound holds for nearly the full range of $\varepsilon$ of interest, since there is always an isometric embedding into dimension $\min\{d, n\}$ (either the identity map, or projection onto $\operatorname{span}(X)$). Previously such a lower bound was only known to hold against linear maps $f$, and not for such a wide range of parameters $\varepsilon, n, d$ [LN16]. The best previously known lower bound for general $f$ was $m = \Omega(\varepsilon^{-2}\lg n/\lg(1/\varepsilon))$ [Wel74, Lev83, Alo03], which is suboptimal for any $\varepsilon = o(1)$.

Comment: v2: simplified proof, also added reference to Lev8
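The following Python sketch does the arithmetic behind the two bounds quoted in the abstract. It is only a rough illustration: the asymptotic notation hides constants, so the functions return the growth terms $\varepsilon^{-2}\lg n$ and $\varepsilon^{-2}\lg n/\lg(1/\varepsilon)$ with the constant assumed to be 1. The function names are hypothetical.

```python
import math


def jl_growth(n, eps):
    """Growth term eps^-2 * lg n of the bound (constant factor assumed 1)."""
    return math.log2(n) / eps ** 2


def older_growth(n, eps):
    """Growth term eps^-2 * lg n / lg(1/eps) of the earlier general bound."""
    return math.log2(n) / (eps ** 2 * math.log2(1.0 / eps))


def eps_in_range(n, d, eps):
    """Check the abstract's condition 1/min(n, d)^0.4999 < eps < 1."""
    return 1.0 / min(n, d) ** 0.4999 < eps < 1.0


# The ratio between the two growth terms is lg(1/eps), which grows
# without bound as eps shrinks.
for eps in (0.25, 0.1, 0.01):
    print(eps, jl_growth(1024, eps), older_growth(1024, eps))
```

For $n = 1024$ and $\varepsilon = 0.25$, the two terms are 160 and 80: the gap is exactly the factor $\lg(1/\varepsilon) = 2$ that the newer bound removes.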

arXiv.org e-Print Archive

Crossref

Recommended from our members

Sketching and streaming algorithms for processing massive data

Author: Jelani Nelson
Publication venue: Association for Computing Machinery (ACM)
Publication date: 21/01/2015

The rate at which electronic information is generated in the world is exploding. In this article we explore techniques known as sketching and streaming for processing massive data both quickly and memory-efficiently.

Engineering and Applied Science

Harvard University - DASH

Bounded Independence Fools Degree-2 Threshold Functions

Authors: Ilias Diakonikolas, Daniel M. Kane, Jelani Nelson
Publication date: 01/01/2009

Let x be a random vector coming from any k-wise independent distribution over {-1,1}^n. For an n-variate degree-2 polynomial p, we prove that E[sgn(p(x))] is determined up to an additive epsilon for k = poly(1/epsilon). This answers an open question of Diakonikolas et al. (FOCS 2009). Using standard constructions of k-wise independent distributions, we obtain a broad class of explicit generators that epsilon-fool the class of degree-2 threshold functions with seed length log(n)*poly(1/epsilon). Our approach is quite robust: it easily extends to yield that the intersection of any constant number of degree-2 threshold functions is epsilon-fooled by poly(1/epsilon)-wise independence. Our results also hold if the entries of x are k-wise independent standard normals, implying for example that bounded independence derandomizes the Goemans-Williamson hyperplane rounding scheme. To achieve our results, we introduce a technique we dub multivariate FT-mollification, a generalization of the univariate form introduced by Kane et al. (SODA 2010) in the context of streaming algorithms. Along the way we prove a generalized hypercontractive inequality for quadratic forms which takes the operator norm of the associated matrix into account. These techniques may be of independent interest.

Comment: Using v1 numbering: removed Lemma G.5 from the Appendix (it was wrong). Net effect is that Theorem G.6 reduces the m^6 dependence of Theorem 8.1 to m^4, not m^
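The abstract refers to standard constructions of k-wise independent distributions without naming one. One textbook construction, shown here as an assumption rather than as the generator used in the paper, evaluates a random polynomial of degree less than k over the integers modulo a prime p: the values at any k distinct points are then independent and uniform on {0, ..., p-1}. The names `poly_hash` and `sample_kwise` are hypothetical, and the sketch outputs values modulo p rather than the ±1 entries discussed in the abstract.

```python
import random


def poly_hash(coeffs, i, p):
    """Evaluate sum(coeffs[j] * i**j) modulo the prime p by Horner's rule."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * i + a) % p
    return acc


def sample_kwise(k, n, p, rng):
    """Return n values in {0, ..., p-1} that are k-wise independent.

    Assumes p is prime and n <= p, so the evaluation points
    0, ..., n-1 are distinct modulo p.
    """
    if n > p:
        raise ValueError("need n <= p so evaluation points are distinct")
    # The k random coefficients are the whole seed: k * log2(p) bits.
    coeffs = [rng.randrange(p) for _ in range(k)]
    return [poly_hash(coeffs, i, p) for i in range(n)]


values = sample_kwise(k=4, n=10, p=101, rng=random.Random(0))
```

The seed consists of the k coefficients only, which is why seed length scales with k times the logarithm of the field size rather than with n.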

arXiv.org e-Print Archive

CiteSeerX